Smart Paper Notes

Cluster-to-adapt: Few Shot Domain Adaptation for Semantic Segmentation across Disjoint Labels

Tarun Kalluri , Manmohan Chandraker

Categories: Computer Vision | Machine Learning

2022-08-04

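Domain adaptation for semantic segmentation across datasets that consist of the same categories has seen several recent successes. A more general scenario, however, is one in which the source and target datasets correspond to non-overlapping label spaces. For example, the categories in segmentation datasets change vastly depending on the type of environment or application, yet they share many valuable semantic relations. Existing approaches based on feature alignment or discrepancy minimization do not take such category shift into account. In this work, we present Cluster-to-Adapt (C2A), a computationally efficient clustering-based approach for domain adaptation across segmentation datasets with completely different, but possibly related, categories. We show that such a clustering objective, enforced in a transformed feature space, automatically selects the categories across source and target domains that can be aligned to improve target performance, while preventing negative transfer for unrelated categories. We demonstrate the approach through experiments on the challenging problem of outdoor-to-indoor adaptation in few-shot and zero-shot settings, with consistent improvements in performance over existing approaches and baselines in all cases.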

MemSAC: Memory Augmented Sample Consistency for Large Scale Domain Adaptation

Tarun Kalluri , Astuti Sharma , Manmohan Chandraker

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2022-07-25

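Practical real-world datasets with plentiful categories introduce new challenges for unsupervised domain adaptation, such as small inter-class discriminability, that existing approaches relying on domain invariance alone cannot handle well. In this work, we propose MemSAC, which exploits sample-level similarity across source and target domains to achieve discriminative transfer, along with architectures that scale to a large number of categories. To this end, we first introduce a memory-augmented approach to efficiently extract pairwise similarity relations between labeled source and unlabeled target domain instances, suited to handling an arbitrary number of classes. Next, we propose and theoretically justify a novel variant of the contrastive loss that promotes local consistency among within-class cross-domain samples while enforcing separation between classes, thus preserving discriminative transfer from source to target. We validate the advantages of MemSAC with significant improvements over the previous state of the art on transfer tasks. We also provide in-depth analysis of and insights into the effectiveness of MemSAC.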

Curator: Creating Large-Scale Curated Labelled Datasets using Self-Supervised Learning

Tarun Narayanan , Ajay Krishnan , Anirudh Koul , Siddha Ganju

Categories: Computer Vision

2022-12-28

Applying machine learning to domains like Earth Sciences is impeded by the lack of labeled data, despite the large corpus of raw data available in such domains. For instance, training a wildfire classifier on satellite imagery requires curating a massive and diverse dataset, which is an expensive and time-consuming process that can span from weeks to months. Searching for relevant examples in over 40 petabytes of unlabelled data requires researchers to manually hunt for such images, much like finding a needle in a haystack. We present a no-code end-to-end pipeline, Curator, which dramatically reduces the time taken to curate an exhaustive labeled dataset. Curator is able to search massive amounts of unlabelled data by combining self-supervision, scalable nearest neighbor search, and active learning to learn and differentiate image representations. The pipeline can also be readily applied to solve problems across different domains. Overall, the pipeline makes it practical for researchers to go from just one reference image to a comprehensive dataset in a short span of time.
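As a rough illustration of the nearest neighbor search step mentioned in the abstract, the sketch below ranks a handful of embeddings by cosine similarity to the embedding of one reference image. The vectors, the names `cosine_similarity` and `nearest_neighbors`, and the brute-force scan are assumptions made for illustration; the abstract does not say how Curator computes or indexes its embeddings, and a system at that scale would presumably rely on an approximate index rather than a full scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_neighbors(query, embeddings, k):
    """Return the k keys most similar to the query (brute-force scan)."""
    ranked = sorted(
        embeddings,
        key=lambda name: cosine_similarity(query, embeddings[name]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical 3-d embeddings standing in for self-supervised image features.
embeddings = {
    "tile_a": [0.9, 0.1, 0.0],
    "tile_b": [0.0, 1.0, 0.0],
    "tile_c": [0.8, 0.2, 0.1],
    "tile_d": [0.0, 0.1, 0.9],
}
reference = [1.0, 0.0, 0.0]  # embedding of the single reference image (assumed)
top2 = nearest_neighbors(reference, embeddings, k=2)
```

In an active-learning loop, the retrieved tiles would be shown to a person for labeling and the labels fed back to refine the search; that part is not sketched here.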

SupeRVol: Super-Resolution Shape and Reflectance Estimation in Inverse Volume Rendering

Mohammed Brahimi , Bjoern Haefner , Tarun Yenamandra , Bastian Goldluecke , Daniel Cremers

Categories: Computer Vision

2022-12-09

We propose an end-to-end inverse rendering pipeline called SupeRVol that allows us to recover 3D shape and material parameters from a set of color images in a super-resolution manner. To this end, we represent both the bidirectional reflectance distribution function (BRDF) and the signed distance function (SDF) by multi-layer perceptrons. In order to obtain both the surface shape and its reflectance properties, we resort to a differentiable volume renderer with a physically based illumination model that allows us to decouple reflectance and lighting. This physical model takes into account the effect of the camera's point spread function, thereby enabling a reconstruction of shape and material at super-resolution quality. Experimental validation confirms that SupeRVol achieves state-of-the-art performance in terms of inverse rendering quality. It generates reconstructions that are sharper than the individual input images, making this method ideally suited for 3D modeling from low-resolution imagery.

Re-visiting Reservoir Computing architectures optimized by Evolutionary Algorithms

Sebastián Basterrech , Tarun Kumar Sharma

Categories: Neural and Evolutionary Computing | Computer Vision | Machine Learning

2022-11-11

For many years, Evolutionary Algorithms (EAs) have been applied to improve Neural Network (NN) architectures. They have been used for solving different problems, such as training the networks (adjusting the weights), designing network topology, optimizing global parameters, and selecting features. Here, we provide a brief systematic survey of the applications of EAs to the specific family of recurrent NNs named Reservoir Computing (RC). At the beginning of the 2000s, the RC paradigm appeared as a good option for employing recurrent NNs without dealing with the inconveniences of the training algorithms. RC models use a nonlinear dynamic system, with a fixed recurrent neural network named the reservoir, and the learning process is restricted to adjusting a linear parametric function. However, an RC model has several hyper-parameters, so EAs are helpful tools for finding optimal RC architectures. We provide an overview of the results in the area, discuss novel advances, and present our vision regarding new trends and still-open questions.

Foundation Models for Semantic Novelty in Reinforcement Learning

Tarun Gupta , Peter Karkus , Tong Che , Danfei Xu , Marco Pavone

Categories: Machine Learning | Artificial Intelligence

2022-11-09

Effectively exploring the environment is a key challenge in reinforcement learning (RL). We address this challenge by defining a novel intrinsic reward based on a foundation model, such as contrastive language image pretraining (CLIP), which can encode a wealth of domain-independent semantic visual-language knowledge about the world. Specifically, our intrinsic reward is defined based on pre-trained CLIP embeddings without any fine-tuning or learning on the target RL task. We demonstrate that CLIP-based intrinsic rewards can drive exploration towards semantically meaningful states and outperform state-of-the-art methods in challenging sparse-reward procedurally-generated environments.

Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale

Federico Bianchi , Pratyusha Kalluri , Esin Durmus , Faisal Ladhak , Myra Cheng , Debora Nozza , Tatsunori Hashimoto , Dan Jurafsky , James Zou , Aylin Caliskan

Categories: Natural Language Processing | Computer Vision

2022-11-07

Machine learning models are now able to convert user-written text descriptions into naturalistic images. These models are available to anyone online and are being used to generate millions of images a day. We investigate these models and find that they amplify dangerous and complex stereotypes. Moreover, we find that the amplified stereotypes are difficult to predict and not easily mitigated by users or model owners. The extent to which these image-generation models perpetuate and amplify stereotypes and their mass deployment is cause for serious concern.

PARSE challenge 2022: Pulmonary Arteries Segmentation using Swin U-Net Transformer (Swin UNETR) and U-Net

Akansh Maurya , Kunal Dashrath Patil , Rohan Padhy , Kalluri Ramakrishna , Ganapathy Krishnamurthi

Categories: Computer Vision

2022-08-20

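In this work, we present our proposed method for segmenting the pulmonary arteries from CT scans using Swin UNETR and U-Net-based deep neural network architectures. Six models, three based on Swin UNETR and three based on 3D U-Net, were combined using a weighted average to produce the final segmentation masks. Our team achieved a multi-level Dice score of 84.36% with this approach. The code for our work is available at the following link: https://github.com/akansh12/parse2022. This work is part of the MICCAI PARSE 2022 challenge.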
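The weighted-average ensembling and the Dice score mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the probability maps, the weights, the 0.5 threshold, and the helper names `weighted_average` and `dice_score` are all hypothetical, and only three toy "models" over five voxels are shown rather than the six networks the authors describe.

```python
def weighted_average(prob_maps, weights):
    """Voxel-wise weighted average of per-model foreground probabilities."""
    total = sum(weights)
    n = len(prob_maps[0])
    return [
        sum(w * p[i] for w, p in zip(weights, prob_maps)) / total
        for i in range(n)
    ]

def dice_score(pred, target):
    """Dice = 2 * |overlap| / (|pred| + |target|) for binary 0/1 masks."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    return 1.0 if denom == 0 else 2.0 * intersection / denom

# Hypothetical foreground probabilities from three models over five voxels.
prob_maps = [
    [0.9, 0.6, 0.2, 0.1, 0.7],
    [0.8, 0.4, 0.3, 0.2, 0.6],
    [0.7, 0.3, 0.6, 0.1, 0.4],
]
weights = [0.5, 0.3, 0.2]  # assumed weights, not the values used by the authors

averaged = weighted_average(prob_maps, weights)
mask = [1 if p >= 0.5 else 0 for p in averaged]  # assumed 0.5 threshold
ground_truth = [1, 1, 0, 0, 1]
score = dice_score(mask, ground_truth)
```

Averaging probabilities before thresholding lets a confident model outvote uncertain ones, which a majority vote over already-thresholded masks would not.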

Comparing Baseline Shapley and Integrated Gradients for Local Explanation: Some Additional Insights

Tianshu Feng , Zhipu Zhou , Joshi Tarun , Vijayan N. Nair

Categories: Machine Learning | Machine Learning (Statistics)

2022-08-12

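There are many different methods in the literature for explaining machine learning results. However, these methods differ in their approaches and often do not provide the same explanations. In this paper, we consider two recent methods: Integrated Gradients (Sundararajan, Taly, and Yan, 2017) and Baseline Shapley (Sundararajan and Najmi, 2020). The original authors have already studied the axiomatic properties of the two methods and provided some comparisons. Our work provides some additional insights into their comparative behavior for tabular data. We discuss common situations in which the two provide identical explanations and situations in which they differ. We also use simulation studies to examine the differences when neural networks with ReLU activation functions are used to fit the models.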
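To make the comparison concrete, the sketch below computes both attributions for a toy one-unit ReLU model with two features. The model, the input, the zero baseline, and all function names are assumptions chosen for illustration and are not taken from the paper. Integrated Gradients is approximated with a midpoint Riemann sum along the straight path from the baseline to the input, and Baseline Shapley is computed exactly by replacing absent features with their baseline values.

```python
def relu(z):
    return max(0.0, z)

def model(x):
    """Toy one-unit ReLU model: f(x) = relu(x1 + 2*x2 - 1)."""
    return relu(x[0] + 2.0 * x[1] - 1.0)

def model_grad(x):
    """Gradient of the toy model (zero where the unit is inactive)."""
    active = (x[0] + 2.0 * x[1] - 1.0) > 0.0
    return [1.0, 2.0] if active else [0.0, 0.0]

def integrated_gradients(x, baseline, steps=3000):
    """Midpoint Riemann-sum approximation of Integrated Gradients."""
    attributions = [0.0] * len(x)
    for k in range(steps):
        t = (k + 0.5) / steps
        point = [b + t * (xi - b) for xi, b in zip(x, baseline)]
        grad = model_grad(point)
        for i in range(len(x)):
            attributions[i] += grad[i] * (x[i] - baseline[i]) / steps
    return attributions

def baseline_shapley_2d(x, baseline):
    """Exact Baseline Shapley for two features."""
    f_none = model(baseline)
    f_first = model([x[0], baseline[1]])
    f_second = model([baseline[0], x[1]])
    f_both = model(x)
    phi_1 = 0.5 * (f_first - f_none) + 0.5 * (f_both - f_second)
    phi_2 = 0.5 * (f_second - f_none) + 0.5 * (f_both - f_first)
    return [phi_1, phi_2]

x, baseline = [1.0, 1.0], [0.0, 0.0]
ig = integrated_gradients(x, baseline)      # close to [2/3, 4/3]
bshap = baseline_shapley_2d(x, baseline)    # [0.5, 1.5]
```

Both attribution vectors sum to f(x) - f(baseline) = 2, yet they split that total differently because the ReLU is inactive on part of the integration path. For a purely additive model the two methods would agree.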

Evaluating Table Structure Recognition: A New Perspective

Tarun Kumar , Himanshu Sharad Bhatt

Categories: Computer Vision | Machine Learning

2022-07-31

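Existing metrics for evaluating table structure recognition algorithms have shortcomings in capturing the alignment of text and empty cells. In this paper, we build on prior work and propose a new metric for table structure recognition, TEDS-based IOU similarity (TEDS (IOU)), which uses bounding boxes instead of text while also being robust to the above shortcomings. We demonstrate the effectiveness of our metric against previous metrics through various examples.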
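The bounding-box comparison that such a metric builds on is the standard intersection over union (IoU) of two axis-aligned boxes, sketched below. The function name `box_iou`, the `(x_min, y_min, x_max, y_max)` layout, and the example cells are assumptions for illustration. The abstract does not describe how TEDS (IOU) folds this quantity into the tree edit distance, so only the box-level computation is shown.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Hypothetical predicted and ground-truth table cells.
predicted_cell = (0.0, 0.0, 4.0, 2.0)
ground_truth_cell = (2.0, 0.0, 6.0, 2.0)
iou = box_iou(predicted_cell, ground_truth_cell)
```

Because the comparison depends only on cell geometry, an empty cell and a text-filled cell are scored the same way, which is the property the abstract highlights.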
